[APMLP-819] Add imageresolver.Config to initialize ImageResolvers #43308
dd-mergequeue[bot] merged 25 commits into main
Go Package Import Differences
Baseline: 4e6b4c9
MaxInitRetries: 5,
InitRetryDelay: 1 * time.Second,
This could be made configurable via config or an env var, but it is already static in the current code, so I'm keeping it that way for now.
Would love to see all of these configurable by ddConfig!
The first two already are, and the last 2 will be removed with the pivot to tag-based, so I might not add them to ddConfig (unless you think it could still be useful in the meantime, then I'm open to adding them)
Static quality checks
✅ Please find below the results from static quality gates
Successful checks
Regression Detector
Regression Detector Results
Baseline: 4e6b4c9
Optimization Goals: ✅ No significant changes detected
| perf | experiment | goal | Δ mean % | Δ mean % CI | trials | links |
|---|---|---|---|---|---|---|
| ➖ | docker_containers_cpu | % cpu utilization | -4.82 | [-7.70, -1.94] | 1 | Logs |
Fine details of change detection per experiment
| perf | experiment | goal | Δ mean % | Δ mean % CI | trials | links |
|---|---|---|---|---|---|---|
| ➖ | quality_gate_logs | % cpu utilization | +1.56 | [+0.09, +3.03] | 1 | Logs bounds checks dashboard |
| ➖ | quality_gate_metrics_logs | memory utilization | +0.84 | [+0.64, +1.05] | 1 | Logs bounds checks dashboard |
| ➖ | ddot_metrics_sum_cumulativetodelta_exporter | memory utilization | +0.40 | [+0.17, +0.63] | 1 | Logs |
| ➖ | tcp_syslog_to_blackhole | ingress throughput | +0.38 | [+0.30, +0.46] | 1 | Logs |
| ➖ | otlp_ingest_logs | memory utilization | +0.27 | [+0.18, +0.37] | 1 | Logs |
| ➖ | quality_gate_idle | memory utilization | +0.19 | [+0.15, +0.23] | 1 | Logs bounds checks dashboard |
| ➖ | otlp_ingest_metrics | memory utilization | +0.10 | [-0.05, +0.25] | 1 | Logs |
| ➖ | file_to_blackhole_100ms_latency | egress throughput | +0.06 | [+0.02, +0.10] | 1 | Logs |
| ➖ | file_tree | memory utilization | +0.01 | [-0.04, +0.06] | 1 | Logs |
| ➖ | uds_dogstatsd_to_api_v3 | ingress throughput | +0.01 | [-0.12, +0.14] | 1 | Logs |
| ➖ | tcp_dd_logs_filter_exclude | ingress throughput | +0.00 | [-0.07, +0.08] | 1 | Logs |
| ➖ | uds_dogstatsd_to_api | ingress throughput | -0.01 | [-0.13, +0.12] | 1 | Logs |
| ➖ | file_to_blackhole_500ms_latency | egress throughput | -0.01 | [-0.38, +0.36] | 1 | Logs |
| ➖ | file_to_blackhole_1000ms_latency | egress throughput | -0.03 | [-0.45, +0.39] | 1 | Logs |
| ➖ | ddot_logs | memory utilization | -0.03 | [-0.11, +0.04] | 1 | Logs |
| ➖ | file_to_blackhole_0ms_latency | egress throughput | -0.10 | [-0.49, +0.29] | 1 | Logs |
| ➖ | ddot_metrics | memory utilization | -0.10 | [-0.31, +0.11] | 1 | Logs |
| ➖ | quality_gate_idle_all_features | memory utilization | -0.11 | [-0.15, -0.07] | 1 | Logs bounds checks dashboard |
| ➖ | ddot_metrics_sum_cumulative | memory utilization | -0.15 | [-0.31, +0.02] | 1 | Logs |
| ➖ | ddot_metrics_sum_delta | memory utilization | -0.21 | [-0.41, -0.01] | 1 | Logs |
| ➖ | docker_containers_memory | memory utilization | -0.43 | [-0.50, -0.35] | 1 | Logs |
| ➖ | uds_dogstatsd_20mb_12k_contexts_20_senders | memory utilization | -0.50 | [-0.56, -0.45] | 1 | Logs |
| ➖ | docker_containers_cpu | % cpu utilization | -4.82 | [-7.70, -1.94] | 1 | Logs |
Bounds Checks: ✅ Passed
| perf | experiment | bounds_check_name | replicates_passed | links |
|---|---|---|---|---|
| ✅ | docker_containers_cpu | simple_check_run | 10/10 | |
| ✅ | docker_containers_memory | memory_usage | 10/10 | |
| ✅ | docker_containers_memory | simple_check_run | 10/10 | |
| ✅ | file_to_blackhole_0ms_latency | lost_bytes | 10/10 | |
| ✅ | file_to_blackhole_0ms_latency | memory_usage | 10/10 | |
| ✅ | file_to_blackhole_1000ms_latency | lost_bytes | 10/10 | |
| ✅ | file_to_blackhole_1000ms_latency | memory_usage | 10/10 | |
| ✅ | file_to_blackhole_100ms_latency | lost_bytes | 10/10 | |
| ✅ | file_to_blackhole_100ms_latency | memory_usage | 10/10 | |
| ✅ | file_to_blackhole_500ms_latency | lost_bytes | 10/10 | |
| ✅ | file_to_blackhole_500ms_latency | memory_usage | 10/10 | |
| ✅ | quality_gate_idle | intake_connections | 10/10 | bounds checks dashboard |
| ✅ | quality_gate_idle | memory_usage | 10/10 | bounds checks dashboard |
| ✅ | quality_gate_idle_all_features | intake_connections | 10/10 | bounds checks dashboard |
| ✅ | quality_gate_idle_all_features | memory_usage | 10/10 | bounds checks dashboard |
| ✅ | quality_gate_logs | intake_connections | 10/10 | bounds checks dashboard |
| ✅ | quality_gate_logs | lost_bytes | 10/10 | bounds checks dashboard |
| ✅ | quality_gate_logs | memory_usage | 10/10 | bounds checks dashboard |
| ✅ | quality_gate_metrics_logs | cpu_usage | 10/10 | bounds checks dashboard |
| ✅ | quality_gate_metrics_logs | intake_connections | 10/10 | bounds checks dashboard |
| ✅ | quality_gate_metrics_logs | lost_bytes | 10/10 | bounds checks dashboard |
| ✅ | quality_gate_metrics_logs | memory_usage | 10/10 | bounds checks dashboard |
Explanation
Confidence level: 90.00%
Effect size tolerance: |Δ mean %| ≥ 5.00%
Performance changes are noted in the perf column of each table:
- ✅ = significantly better comparison variant performance
- ❌ = significantly worse comparison variant performance
- ➖ = no significant change in performance
A regression test is an A/B test of target performance in a repeatable rig, where "performance" is measured as "comparison variant minus baseline variant" for an optimization goal (e.g., ingress throughput). Due to intrinsic variability in measuring that goal, we can only estimate its mean value for each experiment; we report uncertainty in that value as a 90.00% confidence interval denoted "Δ mean % CI".
For each experiment, we treat a change in performance as a "regression" (a change worth investigating further) if all of the following criteria are true:
- Its estimated |Δ mean %| ≥ 5.00%, indicating the change is big enough to merit a closer look.
- Its 90.00% confidence interval "Δ mean % CI" does not contain zero, indicating that if our statistical model is accurate, there is at least a 90.00% chance there is a difference in performance between baseline and comparison variants.
- Its configuration does not mark it "erratic".
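The three criteria can be read as a single boolean check. The following is a rough sketch of that rule only, not the Regression Detector's actual implementation; the `experiment` type and its field names are hypothetical, and the sample row is assumed not to be marked "erratic".

```go
package main

import (
	"fmt"
	"math"
)

// experiment holds one table row. The type and field names are
// hypothetical and chosen only for this illustration.
type experiment struct {
	Name          string
	DeltaMeanPct  float64 // "Δ mean %"
	CILow, CIHigh float64 // bounds of "Δ mean % CI"
	Erratic       bool    // whether the configuration marks it "erratic"
}

// effectSizeTolerance is the 5.00% threshold stated in the explanation.
const effectSizeTolerance = 5.00

// isRegression applies the three stated criteria: the effect is large
// enough, the confidence interval excludes zero, and the experiment is
// not marked erratic.
func isRegression(e experiment) bool {
	bigEnough := math.Abs(e.DeltaMeanPct) >= effectSizeTolerance
	excludesZero := e.CILow > 0 || e.CIHigh < 0
	return bigEnough && excludesZero && !e.Erratic
}

func main() {
	// Values copied from the docker_containers_cpu row above.
	cpu := experiment{
		Name:         "docker_containers_cpu",
		DeltaMeanPct: -4.82,
		CILow:        -7.70,
		CIHigh:       -1.94,
	}
	fmt.Println(cpu.Name, isRegression(cpu))
}
```

Under this reading, the docker_containers_cpu row is not flagged: its interval excludes zero, but |−4.82| is below the 5.00% tolerance.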
CI Pass/Fail Decision
✅ Passed. All Quality Gates passed.
- quality_gate_idle, bounds check memory_usage: 10/10 replicas passed. Gate passed.
- quality_gate_idle, bounds check intake_connections: 10/10 replicas passed. Gate passed.
- quality_gate_metrics_logs, bounds check intake_connections: 10/10 replicas passed. Gate passed.
- quality_gate_metrics_logs, bounds check cpu_usage: 10/10 replicas passed. Gate passed.
- quality_gate_metrics_logs, bounds check memory_usage: 10/10 replicas passed. Gate passed.
- quality_gate_metrics_logs, bounds check lost_bytes: 10/10 replicas passed. Gate passed.
- quality_gate_logs, bounds check intake_connections: 10/10 replicas passed. Gate passed.
- quality_gate_logs, bounds check memory_usage: 10/10 replicas passed. Gate passed.
- quality_gate_logs, bounds check lost_bytes: 10/10 replicas passed. Gate passed.
- quality_gate_idle_all_features, bounds check intake_connections: 10/10 replicas passed. Gate passed.
- quality_gate_idle_all_features, bounds check memory_usage: 10/10 replicas passed. Gate passed.
…/datadog-agent into erikayasuda/imageresolver_config
pkg/clusteragent/admission/mutate/autoinstrumentation/image_resolver.go
// container image references from mutable tags to digests.
package imageresolver

func newDatadoghqRegistries(datadogRegistriesList []string) map[string]struct{} {
Note: This is a duplicate of the one that already exists in the image_resolver.go file, but maintaining both until we move all relevant components under the imageresolver package
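Only the signature of `newDatadoghqRegistries` is visible in the diff. Given the `[]string` input and `map[string]struct{}` return type, a plausible body is a list-to-set conversion. The sketch below assumes that; the function body and the registry names are illustrative guesses, not the PR's code.

```go
package main

import "fmt"

// newDatadoghqRegistries uses the signature shown in the diff. The body
// is an assumption: it turns the list into a set so membership checks
// are constant-time map lookups.
func newDatadoghqRegistries(datadogRegistriesList []string) map[string]struct{} {
	registries := make(map[string]struct{}, len(datadogRegistriesList))
	for _, registry := range datadogRegistriesList {
		registries[registry] = struct{}{}
	}
	return registries
}

func main() {
	// Placeholder registry names, not real configuration values.
	registries := newDatadoghqRegistries([]string{
		"registry-a.example.com",
		"registry-b.example.com",
	})

	_, known := registries["registry-a.example.com"]
	_, unknown := registries["other.example.com"]
	fmt.Println(known, unknown)
}
```

The empty struct value takes no space, which is the usual Go idiom when a map is used purely as a set.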
betterengineering left a comment
Looks good! Thanks for refactoring! Feel free to address my comments in this change or as a follow up.
pkg/clusteragent/admission/mutate/autoinstrumentation/imageresolver/config.go
  }
- imageResolver := NewImageResolver(rcClient, datadogConfig)
+ imageResolver := NewImageResolver(imageresolver.NewConfig(datadogConfig, rcClient))
Isn't this nice that you only had to touch this here 👀
pkg/clusteragent/admission/mutate/autoinstrumentation/imageresolver/config.go
type Config struct {
	Site string
	DDRegistries map[string]struct{}
	RCClient RemoteConfigClient
The client ultimately shouldn't be here. The config struct should be purely config. Dependencies should be passed in when you make an image resolver
There was a problem hiding this comment.
Yeah, that makes sense. My intent here was to add it in the imageresolver.Config since we know we want to remove it when we pivot to tag-based, so I wanted the changes to be in fewer places 😅
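The reviewer's suggestion in this thread (keep the config struct as pure configuration and pass dependencies in when constructing the resolver) could look roughly like the sketch below. This is not what the PR implements: `Resolver`, `NewResolver`, `fakeClient`, and the single `Subscribe` method on `RemoteConfigClient` are all hypothetical names used for illustration. Only `Config`, `Site`, `DDRegistries`, and the `RemoteConfigClient` type name appear in the diff.

```go
package main

import "fmt"

// RemoteConfigClient stands in for the interface named in the diff.
// Its method set here is hypothetical.
type RemoteConfigClient interface {
	Subscribe(product string)
}

// Config holds only plain configuration values, with no client inside.
type Config struct {
	Site         string
	DDRegistries map[string]struct{}
}

// Resolver is a hypothetical resolver that stores its configuration and
// receives the client as a separate dependency.
type Resolver struct {
	cfg    Config
	client RemoteConfigClient
}

// NewResolver takes the dependency as its own argument rather than
// reading it from Config.
func NewResolver(cfg Config, client RemoteConfigClient) *Resolver {
	return &Resolver{cfg: cfg, client: client}
}

// IsDatadogRegistry reports whether the registry is in the configured set.
func (r *Resolver) IsDatadogRegistry(registry string) bool {
	_, ok := r.cfg.DDRegistries[registry]
	return ok
}

// fakeClient is a test double that records subscriptions.
type fakeClient struct{ products []string }

func (f *fakeClient) Subscribe(product string) {
	f.products = append(f.products, product)
}

func main() {
	cfg := Config{
		Site:         "example.com", // placeholder value
		DDRegistries: map[string]struct{}{"registry.example.com": {}},
	}
	resolver := NewResolver(cfg, &fakeClient{})
	fmt.Println(resolver.IsDatadogRegistry("registry.example.com"))
}
```

With this split, removing the client after the pivot to tag-based resolution would touch the constructor signature but leave `Config` unchanged, which is the trade-off the two commenters are weighing.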
/merge
This pull request is not mergeable according to GitHub. Common reasons include pending required checks, missing approvals, or merge conflicts, but it could also be blocked by other repository rules or settings.
Build pipeline has failing jobs for f91991e. Since those jobs are not marked as being allowed to fail, the pipeline will most likely fail.
/merge
Build pipeline has failing jobs for c1b8901. Since those jobs are not marked as being allowed to fail, the pipeline will most likely fail.
/merge
…ver` package (#45151)

### What does this PR do?
Continuing the refactor of the `ImageResolver` code into the `imageresolver` package for clear delineation of the code. See first part [here](#43308). Namely, the changes are:
- Moving `image_resolver.go` and `image_resolver_test.go` into `imageresolver/`
- Renaming functions and types to avoid stutter like `imageresolver.NewImageResolver(...)`
- Creating an `image.go` file to separate definitions representing images handled by the Resolvers
- Updating references outside of the `imageresolver` package to reflect the above changes

No functional changes are introduced by this PR, only refactor/clean up.

### Motivation
The original `imageresolver` code was hard to read, and the goal here is to make this easier to read and maintain for all repo contributors. This is also to help make future changes easier to implement along with the general refactor initiative for the autoinstrumentation cluster agent code.

### Describe how you validated your changes
Verified existing tests all pass.

### Additional Notes
Co-authored-by: erika.yasuda <erika.yasuda@datadoghq.com>
What does this PR do?
Adds a new `Config` type in a separate `imageresolver` package. The config should contain all necessary configurations to determine what type of `ImageResolver` implementation should be used.
There is no functionality/behavior change from this PR.
Motivation
- `NewImageResolver()` now only requires the `ImageResolverConfig`, which can be updated later if needed.
- `imageresolver` package: starting this so other ImageResolver-specific code can be nested in here in a follow-up PR.
Describe how you validated your changes
Local E2E validation using `injector-dev` to ensure existing behavior is retained.
Additional Notes